{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text",
    "id": "view-in-github"
   },
   "source": [
    "<a href=\"https://colab.research.google.com/github/NeuromatchAcademy/course-content/blob/master/tutorials/W3D3_NetworkCausality/W3D3_Tutorial2.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "# Neuromatch Academy 2020 -- Week 3 Day 5, Tutorial 2\n",
    "\n",
    "# Causality Day: Correlations\n",
    "\n",
    "**Content creators**: Ari Benjamin, Tony Liu, Konrad Kording\n",
    "\n",
    "**Content reviewers**: Mike X Cohen, Madineh Sarvestani, Ella Batty, Michael Waskom\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "---\n",
    "# Tutorial objectives\n",
    "\n",
    "This is tutorial 2 on our day of examining causality. Below is the high level outline of what we'll cover today, with the sections we will focus on in this tutorial in bold:\n",
    "\n",
    "1.   Master definitions of causality\n",
    "2.   Understand that estimating causality is possible\n",
    "3.   Learn 4 different methods and understand when they fail\n",
    "    1. perturbations\n",
    "    2. **correlations**\n",
    "    3. simultaneous fitting/regression\n",
    "    5. instrumental variables\n",
    "\n",
    "### Tutorial 2 objectives\n",
    "\n",
    "In tutorial 1, we implemented and explored the dynamical system of neurons we will be working with throughout all of the tutorials today. We also learned about the \"gold standard\" of measuring causal effects through random perturbations. As random perturbations are often not possible, we will now turn to alternative methods to attempt to measure causality. We will:\n",
    "\n",
    "- Learn how to estimate connectivity from observations assuming **correlations approximate causation**\n",
    "- Show that this only works when the network is small\n",
    "\n",
    "### Tutorial 2 setting\n",
    "\n",
    "Often, we can't force neural activites or brain areas to be on or off. We just have to observe. Maybe we can get the correlation between two nodes -- is that good enough? The question we ask in this tutorial is **when is correlation a \"good enough\" substitute for causation?**\n",
    "\n",
    "The answer is not \"never\", actually, but \"sometimes\".\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "---\n",
    "# Setup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "both",
    "colab": {},
    "colab_type": "code",
    "execution": {
     "iopub.execute_input": "2021-05-25T01:22:34.426906Z",
     "iopub.status.busy": "2021-05-25T01:22:34.425622Z",
     "iopub.status.idle": "2021-05-25T01:22:34.754126Z",
     "shell.execute_reply": "2021-05-25T01:22:34.753003Z"
    }
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "colab": {},
    "colab_type": "code",
    "execution": {
     "iopub.execute_input": "2021-05-25T01:22:34.759228Z",
     "iopub.status.busy": "2021-05-25T01:22:34.758241Z",
     "iopub.status.idle": "2021-05-25T01:22:34.846007Z",
     "shell.execute_reply": "2021-05-25T01:22:34.845422Z"
    }
   },
   "outputs": [],
   "source": [
    "#@title Figure settings\n",
    "import ipywidgets as widgets       # interactive display\n",
    "%config InlineBackend.figure_format = 'retina'\n",
    "plt.style.use(\"https://raw.githubusercontent.com/NeuromatchAcademy/course-content/master/nma.mplstyle\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "colab": {},
    "colab_type": "code",
    "execution": {
     "iopub.execute_input": "2021-05-25T01:22:34.854954Z",
     "iopub.status.busy": "2021-05-25T01:22:34.848808Z",
     "iopub.status.idle": "2021-05-25T01:22:34.874229Z",
     "shell.execute_reply": "2021-05-25T01:22:34.874665Z"
    }
   },
   "outputs": [],
   "source": [
    "# @title Helper functions\n",
    "\n",
    "\n",
    "def sigmoid(x):\n",
    "    \"\"\"\n",
    "    Compute sigmoid nonlinearity element-wise on x.\n",
    "\n",
    "    Args:\n",
    "        x (np.ndarray): the numpy data array we want to transform\n",
    "    Returns\n",
    "        (np.ndarray): x with sigmoid nonlinearity applied\n",
    "    \"\"\"\n",
    "    return 1 / (1 + np.exp(-x))\n",
    "\n",
    "\n",
    "def logit(x):\n",
    "    \"\"\"\n",
    "\n",
    "    Applies the logit (inverse sigmoid) transformation\n",
    "\n",
    "    Args:\n",
    "        x (np.ndarray): the numpy data array we want to transform\n",
    "    Returns\n",
    "        (np.ndarray): x with logit nonlinearity applied\n",
    "    \"\"\"\n",
    "    return np.log(x/(1-x))\n",
    "\n",
    "\n",
    "def create_connectivity(n_neurons, random_state=42, p=0.9):\n",
    "    \"\"\"\n",
    "    Generate our nxn causal connectivity matrix.\n",
    "\n",
    "    Args:\n",
    "        n_neurons (int): the number of neurons in our system.\n",
    "        random_state (int): random seed for reproducibility\n",
    "\n",
    "    Returns:\n",
    "        A (np.ndarray): our 0.1 sparse connectivity matrix\n",
    "    \"\"\"\n",
    "    np.random.seed(random_state)\n",
    "    A_0 = np.random.choice([0, 1], size=(n_neurons, n_neurons), p=[p, 1 - p])\n",
    "\n",
    "    # set the timescale of the dynamical system to about 100 steps\n",
    "    _, s_vals, _ = np.linalg.svd(A_0)\n",
    "    A = A_0 / (1.01 * s_vals[0])\n",
    "\n",
    "    # _, s_val_test, _ = np.linalg.svd(A)\n",
    "    # assert s_val_test[0] < 1, \"largest singular value >= 1\"\n",
    "\n",
    "    return A\n",
    "\n",
    "\n",
    "def see_neurons(A, ax):\n",
    "    \"\"\"\n",
    "    Visualizes the connectivity matrix.\n",
    "\n",
    "    Args:\n",
    "        A (np.ndarray): the connectivity matrix of shape (n_neurons, n_neurons)\n",
    "        ax (plt.axis): the matplotlib axis to display on\n",
    "\n",
    "    Returns:\n",
    "        Nothing, but visualizes A.\n",
    "    \"\"\"\n",
    "    A = A.T  # make up for opposite connectivity\n",
    "    n = len(A)\n",
    "    ax.set_aspect('equal')\n",
    "    thetas = np.linspace(0, np.pi * 2, n, endpoint=False)\n",
    "    x, y = np.cos(thetas), np.sin(thetas),\n",
    "    ax.scatter(x, y, c='k', s=150)\n",
    "    A = A / A.max()\n",
    "    for i in range(n):\n",
    "        for j in range(n):\n",
    "            if A[i, j] > 0:\n",
    "                ax.arrow(x[i], y[i], x[j] - x[i], y[j] - y[i], color='k', alpha=A[i, j], head_width=.15,\n",
    "                        width = A[i,j] / 25, shape='right', length_includes_head=True)\n",
    "    ax.axis('off')\n",
    "\n",
    "\n",
    "def simulate_neurons(A, timesteps, random_state=42):\n",
    "    \"\"\"\n",
    "    Simulates a dynamical system for the specified number of neurons and timesteps.\n",
    "\n",
    "    Args:\n",
    "        A (np.array): the connectivity matrix\n",
    "        timesteps (int): the number of timesteps to simulate our system.\n",
    "        random_state (int): random seed for reproducibility\n",
    "\n",
    "    Returns:\n",
    "        - X has shape (n_neurons, timeteps).\n",
    "    \"\"\"\n",
    "    np.random.seed(random_state)\n",
    "\n",
    "    n_neurons = len(A)\n",
    "    X = np.zeros((n_neurons, timesteps))\n",
    "\n",
    "    for t in range(timesteps - 1):\n",
    "        # solution\n",
    "        epsilon = np.random.multivariate_normal(np.zeros(n_neurons), np.eye(n_neurons))\n",
    "        X[:, t + 1] = sigmoid(A.dot(X[:, t]) + epsilon)\n",
    "\n",
    "        assert epsilon.shape == (n_neurons,)\n",
    "    return X\n",
    "\n",
    "\n",
    "def get_sys_corr(n_neurons, timesteps, random_state=42, neuron_idx=None):\n",
    "    \"\"\"\n",
    "    A wrapper function for our correlation calculations between A and R.\n",
    "\n",
    "    Args:\n",
    "        n_neurons (int): the number of neurons in our system.\n",
    "        timesteps (int): the number of timesteps to simulate our system.\n",
    "        random_state (int): seed for reproducibility\n",
    "        neuron_idx (int): optionally provide a neuron idx to slice out\n",
    "\n",
    "    Returns:\n",
    "        A single float correlation value representing the similarity between A and R\n",
    "    \"\"\"\n",
    "\n",
    "    A = create_connectivity(n_neurons, random_state)\n",
    "    X = simulate_neurons(A, timesteps)\n",
    "\n",
    "    R = correlation_for_all_neurons(X)\n",
    "\n",
    "    return np.corrcoef(A.flatten(), R.flatten())[0, 1]\n",
    "\n",
    "\n",
    "def correlation_for_all_neurons(X):\n",
    "  \"\"\"Computes the connectivity matrix for the all neurons using correlations\n",
    "\n",
    "    Args:\n",
    "        X: the matrix of activities\n",
    "\n",
    "    Returns:\n",
    "        estimated_connectivity (np.ndarray): estimated connectivity for the selected neuron, of shape (n_neurons,)\n",
    "  \"\"\"\n",
    "  n_neurons = len(X)\n",
    "  S = np.concatenate([X[:, 1:], X[:, :-1]], axis=0)\n",
    "  R = np.corrcoef(S)[:n_neurons, n_neurons:]\n",
    "  return R\n",
    "\n",
    "\n",
    "def plot_estimation_quality_vs_n_neurons(number_of_neurons):\n",
    "    \"\"\"\n",
    "    A wrapper function that calculates correlation between true and estimated connectivity\n",
    "    matrices for each number of neurons and plots\n",
    "\n",
    "    Args:\n",
    "      number_of_neurons (list): list of different number of neurons for modeling system\n",
    "      corr_func (function): Function for computing correlation\n",
    "    \"\"\"\n",
    "    corr_data = np.zeros((n_trials, len(number_of_neurons)))\n",
    "    for trial in range(n_trials):\n",
    "      print(\"simulating trial {} of {}\".format(trial + 1, n_trials))\n",
    "      for j, size in enumerate(number_of_neurons):\n",
    "          corr = get_sys_corr(size, timesteps, trial)\n",
    "          corr_data[trial, j] = corr\n",
    "\n",
    "    corr_mean = corr_data.mean(axis=0)\n",
    "    corr_std = corr_data.std(axis=0)\n",
    "\n",
    "    plt.plot(number_of_neurons, corr_mean)\n",
    "    plt.fill_between(number_of_neurons,\n",
    "                     corr_mean - corr_std,\n",
    "                     corr_mean + corr_std,\n",
    "                     alpha=.2)\n",
    "    plt.xlabel(\"Number of neurons\")\n",
    "    plt.ylabel(\"Correlation\")\n",
    "    plt.title(\"Similarity between A and R as a function of network size\")\n",
    "    plt.show()\n",
    "\n",
    "\n",
    "def plot_connectivity_matrix(A, ax=None):\n",
    "  \"\"\"Plot the (weighted) connectivity matrix A as a heatmap\n",
    "\n",
    "    Args:\n",
    "      A (ndarray): connectivity matrix (n_neurons by n_neurons)\n",
    "      ax: axis on which to display connectivity matrix\n",
    "  \"\"\"\n",
    "  if ax is None:\n",
    "    ax = plt.gca()\n",
    "  lim = np.abs(A).max()\n",
    "  im = ax.imshow(A, vmin=-lim, vmax=lim, cmap=\"coolwarm\")\n",
    "  ax.tick_params(labelsize=10)\n",
    "  ax.xaxis.label.set_size(15)\n",
    "  ax.yaxis.label.set_size(15)\n",
    "  cbar = ax.figure.colorbar(im, ax=ax, ticks=[0], shrink=.7)\n",
    "  cbar.ax.set_ylabel(\"Connectivity Strength\", rotation=90,\n",
    "                     labelpad= 20, va=\"bottom\")\n",
    "  ax.set(xlabel=\"Connectivity from\", ylabel=\"Connectivity to\")\n",
    "\n",
    "\n",
    "def plot_true_vs_estimated_connectivity(estimated_connectivity, true_connectivity, selected_neuron=None):\n",
    "  \"\"\"Visualize true vs estimated connectivity matrices\n",
    "\n",
    "  Args:\n",
    "    estimated_connectivity (ndarray): estimated connectivity (n_neurons by n_neurons)\n",
    "    true_connectivity (ndarray): ground-truth connectivity (n_neurons by n_neurons)\n",
    "    selected_neuron (int or None): None if plotting all connectivity, otherwise connectivity\n",
    "      from selected_neuron will be shown\n",
    "\n",
    "  \"\"\"\n",
    "  fig, axs = plt.subplots(1, 2, figsize=(10, 5))\n",
    "\n",
    "  if selected_neuron is not None:\n",
    "    plot_connectivity_matrix(np.expand_dims(estimated_connectivity, axis=1), ax=axs[0])\n",
    "    plot_connectivity_matrix(true_connectivity[:, [selected_neuron]], ax=axs[1])\n",
    "    axs[0].set_xticks([0])\n",
    "    axs[1].set_xticks([0])\n",
    "    axs[0].set_xticklabels([selected_neuron])\n",
    "    axs[1].set_xticklabels([selected_neuron])\n",
    "  else:\n",
    "    plot_connectivity_matrix(estimated_connectivity, ax=axs[0])\n",
    "    plot_connectivity_matrix(true_connectivity, ax=axs[1])\n",
    "\n",
    "  axs[1].set(title=\"True connectivity\")\n",
    "  axs[0].set(title=\"Estimated connectivity\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "---\n",
    "# Section 1: Small systems\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 517
    },
    "colab_type": "code",
    "execution": {
     "iopub.execute_input": "2021-05-25T01:22:34.882991Z",
     "iopub.status.busy": "2021-05-25T01:22:34.882431Z",
     "iopub.status.idle": "2021-05-25T01:22:34.916703Z",
     "shell.execute_reply": "2021-05-25T01:22:34.917167Z"
    },
    "outputId": "8e08e9a1-f31a-4746-dad7-6a8fc1a351c0"
   },
   "outputs": [],
   "source": [
    "#@title Video 1: Correlation vs causation\n",
    "# Insert the ID of the corresponding youtube video\n",
    "from IPython.display import YouTubeVideo\n",
    "video = YouTubeVideo(id=\"vjBO-S7KNPI\", width=854, height=480, fs=1)\n",
    "print(\"Video available at https://youtu.be/\" + video.id)\n",
    "video"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "## Exercise 1: Try to approximate causation with correlation\n",
    "\n",
    "In small systems, correlation can look like causation. Let's attempt to recover the true connectivity matrix (A) just by correlating the neural state at each timestep with the previous state: $C=\\vec{x_t}{\\vec{x_{t+1}}^T}$. \n",
    "\n",
    "Complete this function to estimate the connectivity matrix of a single neuron by calculating the correlation coefficients with every other neuron at the next timestep. That is, please correlate two vectors: 1) the activity of a selected neuron at time $t$ 2) The activity of all other neurons at time t+1."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "execution": {
     "iopub.execute_input": "2021-05-25T01:22:34.924163Z",
     "iopub.status.busy": "2021-05-25T01:22:34.923557Z",
     "iopub.status.idle": "2021-05-25T01:22:35.464571Z",
     "shell.execute_reply": "2021-05-25T01:22:35.464042Z"
    }
   },
   "outputs": [],
   "source": [
    "def compute_connectivity_from_single_neuron(X, selected_neuron):\n",
    "    \"\"\"\n",
    "    Computes the connectivity matrix from a single neuron neurons using correlations\n",
    "\n",
    "    Args:\n",
    "        X (ndarray): the matrix of activities\n",
    "        selected_neuron (int): the index of the selected neuron\n",
    "\n",
    "    Returns:\n",
    "        estimated_connectivity (ndarray): estimated connectivity for the selected neuron, of shape (n_neurons,)\n",
    "    \"\"\"\n",
    "\n",
    "    # Extract the current activity of selected_neuron, t\n",
    "    current_activity = X[selected_neuron, :-1]\n",
    "\n",
    "    # Extract the observed outcomes of all the neurons\n",
    "    next_activity = X[:, 1:]\n",
    "\n",
    "    # Initialize estimated connectivity matrix\n",
    "    estimated_connectivity = np.zeros(n_neurons)\n",
    "\n",
    "    # Loop through all neurons\n",
    "    for neuron_idx in range(n_neurons):\n",
    "\n",
    "        # Get the activity of neuron_idx\n",
    "        this_output_activity = next_activity[neuron_idx]\n",
    "\n",
    "        ########################################################################\n",
    "        ## TODO: Estimate the neural correlations between\n",
    "        ## this_output_activity        and       current_activity\n",
    "        ## -------------------                  ----------------\n",
    "        ##\n",
    "        ## Note that np.corrcoef returns the full correlation matrix; we want the\n",
    "        ## top right corner, which we have already provided.\n",
    "        ## FIll out function and remove\n",
    "        raise NotImplementedError('Compute neural correlations')\n",
    "        ########################################################################\n",
    "        # Compute correlation\n",
    "        correlation = np.corrcoef(...)[0, 1]\n",
    "\n",
    "        # Store this neuron's correlation\n",
    "        estimated_connectivity[neuron_idx] = correlation\n",
    "\n",
    "    return estimated_connectivity\n",
    "\n",
    "# Simulate a 6 neuron system for 5000 timesteps again.\n",
    "n_neurons = 6\n",
    "timesteps = 5000\n",
    "selected_neuron = 1\n",
    "\n",
    "# Invoke a helper function that generates our nxn causal connectivity matrix\n",
    "A = create_connectivity(n_neurons)\n",
    "\n",
    "# Invoke a helper function that simulates the neural activity\n",
    "X = simulate_neurons(A, timesteps)\n",
    "\n",
    "# Uncomment below to test your function\n",
    "# estimated_connectivity = compute_connectivity_from_single_neuron(X, selected_neuron)\n",
    "\n",
    "# plot_true_vs_estimated_connectivity(estimated_connectivity, A, selected_neuron)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 357
    },
    "colab_type": "code",
    "execution": {
     "iopub.execute_input": "2021-05-25T01:22:35.471805Z",
     "iopub.status.busy": "2021-05-25T01:22:35.471199Z",
     "iopub.status.idle": "2021-05-25T01:22:36.434336Z",
     "shell.execute_reply": "2021-05-25T01:22:36.434804Z"
    },
    "outputId": "4ea81e03-0a13-40c3-db5f-141ac9e57d5e"
   },
   "outputs": [],
   "source": [
    "# to_remove solution\n",
    "def compute_connectivity_from_single_neuron(X, selected_neuron):\n",
    "    \"\"\"\n",
    "    Computes the connectivity matrix from a single neuron neurons using correlations\n",
    "\n",
    "    Args:\n",
    "        X (ndarray): the matrix of activities\n",
    "        selected_neuron (int): the index of the selected neuron\n",
    "\n",
    "    Returns:\n",
    "        estimated_connectivity (ndarray): estimated connectivity for the selected neuron, of shape (n_neurons,)\n",
    "    \"\"\"\n",
    "\n",
    "    # Extract the current activity of selected_neuron, t\n",
    "    current_activity = X[selected_neuron, :-1]\n",
    "\n",
    "    # Extract the observed outcomes of all the neurons\n",
    "    next_activity = X[:, 1:]\n",
    "\n",
    "    # Initialize estimated connectivity matrix\n",
    "    estimated_connectivity = np.zeros(n_neurons)\n",
    "\n",
    "    # Loop through all neurons\n",
    "    for neuron_idx in range(n_neurons):\n",
    "\n",
    "        # Get the activity of neuron_idx\n",
    "        this_output_activity = next_activity[neuron_idx]\n",
    "\n",
    "        # Compute correlation\n",
    "        correlation = np.corrcoef(this_output_activity, current_activity)[0, 1]\n",
    "\n",
    "        # Store this neuron's correlation\n",
    "        estimated_connectivity[neuron_idx] = correlation\n",
    "\n",
    "    return estimated_connectivity\n",
    "\n",
    "# Simulate a 6 neuron system for 5000 timesteps again.\n",
    "n_neurons = 6\n",
    "timesteps = 5000\n",
    "selected_neuron = 1\n",
    "\n",
    "# Invoke a helper function that generates our nxn causal connectivity matrix\n",
    "A = create_connectivity(n_neurons)\n",
    "\n",
    "# Invoke a helper function that simulates the neural activity\n",
    "X = simulate_neurons(A, timesteps)\n",
    "\n",
    "# Uncomment below to test your function\n",
    "estimated_connectivity = compute_connectivity_from_single_neuron(X, selected_neuron)\n",
    "\n",
    "with plt.xkcd():\n",
    "  plot_true_vs_estimated_connectivity(estimated_connectivity, A, selected_neuron)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "## Section 1.1: Visualization for all neurons\n",
    "\n",
    "Hopefully you saw that it pretty much worked. We wrote a function that does what you just did but in matrix form, so it's a little faster. It also does all neurons at the same time (helper function `correlation_for_all_neurons`).\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 718
    },
    "colab_type": "code",
    "execution": {
     "iopub.execute_input": "2021-05-25T01:22:36.440839Z",
     "iopub.status.busy": "2021-05-25T01:22:36.440291Z",
     "iopub.status.idle": "2021-05-25T01:22:37.193803Z",
     "shell.execute_reply": "2021-05-25T01:22:37.194239Z"
    },
    "outputId": "ac00f376-a128-4f7a-c4f0-cbf354a2140f"
   },
   "outputs": [],
   "source": [
    "#@markdown Execute this cell to visualize full estimated vs true connectivity\n",
    "R = correlation_for_all_neurons(X)\n",
    "\n",
    "fig, axs = plt.subplots(1, 2, figsize=(10, 5))\n",
    "see_neurons(A, axs[0])\n",
    "plot_connectivity_matrix(A, ax=axs[1])\n",
    "plt.suptitle(\"True connectivity matrix A\")\n",
    "\n",
    "fig, axs = plt.subplots(1, 2, figsize=(10, 5))\n",
    "see_neurons(R, axs[0])\n",
    "plot_connectivity_matrix(R, ax=axs[1])\n",
    "plt.suptitle(\"Estimated connectivity matrix R\");"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "That pretty much worked too. Let's quantify how much it worked. \n",
    "\n",
    "We'll calculate the correlation coefficient between the true connectivity and the actual connectivity;"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 34
    },
    "colab_type": "code",
    "execution": {
     "iopub.execute_input": "2021-05-25T01:22:37.200846Z",
     "iopub.status.busy": "2021-05-25T01:22:37.199576Z",
     "iopub.status.idle": "2021-05-25T01:22:37.202704Z",
     "shell.execute_reply": "2021-05-25T01:22:37.202204Z"
    },
    "outputId": "a3bbaef8-898d-4e9f-cc15-5ca8fb153bc9"
   },
   "outputs": [],
   "source": [
    "print(\"Correlation matrix of A and R:\", np.corrcoef(A.flatten(), R.flatten())[0, 1])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "It *appears* in our system that correlation captures causality."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 517
    },
    "colab_type": "code",
    "execution": {
     "iopub.execute_input": "2021-05-25T01:22:37.208003Z",
     "iopub.status.busy": "2021-05-25T01:22:37.207414Z",
     "iopub.status.idle": "2021-05-25T01:22:37.241674Z",
     "shell.execute_reply": "2021-05-25T01:22:37.242137Z"
    },
    "outputId": "f74723f8-6e3b-4848-b3e1-f48dfebd0770"
   },
   "outputs": [],
   "source": [
    "#@title Video 2: Correlation ~ causation for small systems\n",
    "# Insert the ID of the corresponding youtube video\n",
    "from IPython.display import YouTubeVideo\n",
    "video = YouTubeVideo(id=\"eWLOnTUe9SM\", width=854, height=480, fs=1)\n",
    "print(\"Video available at https://youtu.be/\" + video.id)\n",
    "video"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "**Video correction**: the connectivity graph plots and associated explanations in this and other videos show the wrong direction of connectivity (the arrows should be pointing the opposite direction). This has been fixed in the figures above."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "---\n",
    "# Section 2: Large systems\n",
    "\n",
    "As our system becomes more complex however, correlation fails to capture causality."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "## Section 2.1: Failure of correlation in complex systems\n",
    "\n",
    "Let's jump to a much bigger system. Instead of 6 neurons, we will now use 100 neurons. How does the estimation quality of the connectivity matrix change? "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 542
    },
    "colab_type": "code",
    "execution": {
     "iopub.execute_input": "2021-05-25T01:22:37.248719Z",
     "iopub.status.busy": "2021-05-25T01:22:37.248076Z",
     "iopub.status.idle": "2021-05-25T01:22:41.582548Z",
     "shell.execute_reply": "2021-05-25T01:22:41.582032Z"
    },
    "outputId": "ad0d7b58-64e4-4b95-d520-ed2718f01674"
   },
   "outputs": [],
   "source": [
    "# @markdown Execute this cell to simulate large system, estimate connectivity matrix with correlation and return estimation quality\n",
    "\n",
    "# Simulate a 100 neuron system for 5000 timesteps.\n",
    "n_neurons = 100\n",
    "timesteps = 5000\n",
    "random_state = 42\n",
    "\n",
    "A = create_connectivity(n_neurons, random_state)\n",
    "X = simulate_neurons(A, timesteps)\n",
    "R = correlation_for_all_neurons(X)\n",
    "\n",
    "print(\"Correlation matrix of A and R:\", np.corrcoef(A.flatten(), R.flatten())[0, 1])\n",
    "\n",
    "fig, axs = plt.subplots(1, 2, figsize=(16, 8))\n",
    "plot_connectivity_matrix(A, ax=axs[0])\n",
    "axs[0].set_title(\"True connectivity matrix A\")\n",
    "plot_connectivity_matrix(R, ax=axs[1])\n",
    "axs[1].set_title(\"Estimated connectivity matrix R\");"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 517
    },
    "colab_type": "code",
    "execution": {
     "iopub.execute_input": "2021-05-25T01:22:41.587602Z",
     "iopub.status.busy": "2021-05-25T01:22:41.586509Z",
     "iopub.status.idle": "2021-05-25T01:22:41.623410Z",
     "shell.execute_reply": "2021-05-25T01:22:41.622928Z"
    },
    "outputId": "78315c0f-fa0c-48b4-ba5b-054fbe9bf4c6"
   },
   "outputs": [],
   "source": [
    "#@title Video 3: Correlation vs causation in large systems\n",
    "# Insert the ID of the corresponding youtube video\n",
    "from IPython.display import YouTubeVideo\n",
    "video = YouTubeVideo(id=\"U4sV-7g8T08\", width=854, height=480, fs=1)\n",
    "print(\"Video available at https://youtu.be/\" + video.id)\n",
    "video"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "## Section 2.2: Correlation as a function of network size\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Interactive Demo: Connectivity estimation as a function of number of neurons\n",
    "\n",
    "Instead of looking at just a few neurons (6) or a lot of neurons (100), as above, we will now systematically vary the number of neurons and plot the resulting changes in correlation coefficient between the true and estimated connectivity matrices. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 390,
     "referenced_widgets": [
      "cf22aa4fb4ed41beab807ca87e716523",
      "19bc1ab77d994a9f8171b043f6781b77",
      "5b1b37e485a442acbad88c4f2e1cad78",
      "7ae8964565a0490895a9475f33a12355",
      "c2fd248d9e4b46509adc03064fccc26e",
      "1a90b909721a455391862f96ca684784",
      "c938b106a53d409baade3b3fbc43f283"
     ]
    },
    "colab_type": "code",
    "execution": {
     "iopub.execute_input": "2021-05-25T01:22:41.647979Z",
     "iopub.status.busy": "2021-05-25T01:22:41.631112Z",
     "iopub.status.idle": "2021-05-25T01:22:42.445142Z",
     "shell.execute_reply": "2021-05-25T01:22:42.444641Z"
    },
    "outputId": "dec00852-4aaa-4e35-8229-df1aa7d1edc1"
   },
   "outputs": [],
   "source": [
    "#@markdown Execute this cell to enable demo\n",
    "@widgets.interact(n_neurons=(6, 42, 3))\n",
    "def plot_corrs(n_neurons=6):\n",
    "  fig, axs = plt.subplots(1, 3, figsize=(15, 5))\n",
    "  timesteps = 2000\n",
    "  random_state = 42\n",
    "  A = create_connectivity(n_neurons, random_state)\n",
    "  X = simulate_neurons(A, timesteps)\n",
    "  R = correlation_for_all_neurons(X)\n",
    "  corr = np.corrcoef(A.flatten(), R.flatten())[0, 1]\n",
    "  plot_connectivity_matrix(A, ax=axs[0])\n",
    "  plot_connectivity_matrix(R, ax=axs[1])\n",
    "  axs[0].set_title(\"True connectivity\")\n",
    "  axs[1].set_title(\"Estimated connectivity\")\n",
    "  axs[2].text(0, 0.5, \"Correlation : {:.2f}\".format(corr), size=15)\n",
    "  axs[2].axis('off')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "Of course there is some variability due to randomness in $A$. Let's average over a few trials and find the relationship."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 556
    },
    "colab_type": "code",
    "execution": {
     "iopub.execute_input": "2021-05-25T01:22:42.451257Z",
     "iopub.status.busy": "2021-05-25T01:22:42.450711Z",
     "iopub.status.idle": "2021-05-25T01:22:49.460382Z",
     "shell.execute_reply": "2021-05-25T01:22:49.459895Z"
    },
    "outputId": "a3795377-ea62-494e-8647-03aac2567acd"
   },
   "outputs": [],
   "source": [
    "#@markdown Execute this cell to plot connectivity estimation as a function of network size\n",
    "\n",
    "n_trials = 5\n",
    "timesteps = 1000  # shorter timesteps for faster running time\n",
    "number_of_neurons = [5, 10, 25, 50, 100]\n",
    "plot_estimation_quality_vs_n_neurons(number_of_neurons)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Interactive Demo: Connectivity estimation as a function of the sparsity of $A$\n",
    "\n",
    "You may rightly wonder if correlation only fails for large systems for certain types of $A$. In this interactive demo, you can examine connectivity estimation as a function of the sparsity of $A$. Does connectivity estimation get better or worse with less sparsity?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 390,
     "referenced_widgets": [
      "f0f38ab354e54cf3bf1ee3dc335e044e",
      "445881f49ccd4af9b816b432ddee4482",
      "b0aec106d1ff4065b52b8266a0a5c91b",
      "7a67e65c1157417883e4dfff3b794c7c",
      "05598b95bf124267a5d9a3b9f1e0a24c",
      "259bcd68016f4b49b67d1ffbaa3b2701",
      "6114d921a2554036b638a415625831d6"
     ]
    },
    "colab_type": "code",
    "execution": {
     "iopub.execute_input": "2021-05-25T01:22:49.485985Z",
     "iopub.status.busy": "2021-05-25T01:22:49.485207Z",
     "iopub.status.idle": "2021-05-25T01:22:50.259668Z",
     "shell.execute_reply": "2021-05-25T01:22:50.260138Z"
    },
    "outputId": "2ddcb0ac-5aaf-4eb9-fa45-8e46748c7b87"
   },
   "outputs": [],
   "source": [
    "#@title\n",
    "#@markdown Execute this cell to enable demo\n",
    "@widgets.interact(sparsity=(0.01, 0.99, .01))\n",
    "def plot_corrs(sparsity=0.9):\n",
    "  fig, axs = plt.subplots(1, 3, figsize=(15, 5))\n",
    "  timesteps = 2000\n",
    "  random_state = 42\n",
    "  n_neurons = 25\n",
    "  A = create_connectivity(n_neurons, random_state, sparsity)\n",
    "  X = simulate_neurons(A, timesteps)\n",
    "  R = correlation_for_all_neurons(X)\n",
    "  corr=np.corrcoef(A.flatten(), R.flatten())[0, 1]\n",
    "  plot_connectivity_matrix(A, ax=axs[0])\n",
    "  plot_connectivity_matrix(R, ax=axs[1])\n",
    "  axs[0].set_title(\"True connectivity\")\n",
    "  axs[1].set_title(\"Estimated connectivity\")\n",
    "  axs[2].text(0, 0.5, \"Correlation : {:.2f}\".format(corr), size=15)\n",
    "  axs[2].axis('off')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "---\n",
    "# Discussion questions\n",
    "\n",
    "Here are some questions you might chat about before break:\n",
    "\n",
    "\n",
    "\n",
    "*   Think of a research paper you've written. Did it use previous causal knowledge (e.g. a mechanism), or ask a causal question? Try phrasing that causal relationship in the language of an intervention. (\"*If I were to force $A$ to be $A'$, $B$ would...*\")\n",
    "*   What methods for interventions exist in your area of neuroscience?\n",
    "*   Think about these common \"filler\" words. Do they imply a causal relationship, in its interventional definition? (*regulates, mediates, generates, modulates, shapes, underlies, produces, encodes, induces, enables, ensures, supports, promotes, determines*)\n",
    "*   What dimensionality would you (very roughly) estimate the brain to be? Would you expect correlations between neurons to give you their connectivity? Why?\n",
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "---\n",
    "# Summary"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 517
    },
    "colab_type": "code",
    "execution": {
     "iopub.execute_input": "2021-05-25T01:22:50.268545Z",
     "iopub.status.busy": "2021-05-25T01:22:50.267972Z",
     "iopub.status.idle": "2021-05-25T01:22:50.303221Z",
     "shell.execute_reply": "2021-05-25T01:22:50.302634Z"
    },
    "outputId": "77097b12-c009-4f67-b347-485d259a75d4"
   },
   "outputs": [],
   "source": [
    "#@title Video 4: Summary\n",
    "# Insert the ID of the corresponding youtube video\n",
    "from IPython.display import YouTubeVideo\n",
    "video = YouTubeVideo(id=\"hRyAN3yak_U\", width=854, height=480, fs=1)\n",
    "print(\"Video available at https://youtu.be/\" + video.id)\n",
    "video"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "Now for the takeaway. We know that for large systems correlation ≠ causation. But what about when we coarsely sample the large system? Do we get better at estimating the *effective* causal interaction between groups (=average of weights) from the correlation between the groups?\n",
    "\n",
    "From our simulation above, the answer appears to be no: as the number of neurons per group increases, we don't see any significant increase in our ability to estimate the causal interaction between groups."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "---\n",
    "# Appendix\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "## Correlation as similarity metric\n",
    "\n",
    "We'd like to note here that though we make use of Pearson correlation coefficients throughout all of our tutorials to measure similarity between our estimated connectivity matrix $R$ and the ground truth connectivity $A$, this is not strictly correct usage of Pearson correlations as elements of $A$ are not normally distributed (they are in fact binary).\n",
    "\n",
    "We use Pearson correlations as they are quick and easy to compute within the Numpy framework and provide qualitatively similar results to other correlation metrics. Other ways to compute similarities:\n",
    "- [Spearman rank correlations](https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient), which does not require normally distributed data\n",
    "- dichotomizing our estimated matrix $R$ by the median and then running concordance analysis, such as computing [Cohen's kappa](https://en.wikipedia.org/wiki/Cohen%27s_kappa)\n",
    "\n",
    "Another thing to consider: all we want is some measure of the similarity between $A$ and $R$. Element-wise comparisons are one way to do this, but are there other ways you can think of? What about matrix similarities?\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "---\n",
    "## (Bonus) Advanced Section 3: Low resolution systems\n",
    "\n",
    "Please come back to this section only if you get through all the other material during tutorials (or later on your own). Or, just scroll through and see the example solutions.\n",
    "\n",
    "A common situation in neuroscience is that you observe the *average* activity of large groups of neurons. (Think fMRI, EEG, LFP, etc.)\n",
    "We're going to simulate this effect, and ask if correlations work to recover the average causal effect of groups of neurons or areas.\n",
    "\n",
    "**Note on the quality of the analogy**: This is not intended as a perfect analogy of the brain or fMRI. Instead, we want to ask: *in a big system in which correlations fail to estimate causality, can you at least recover average connectivity between groups?*\n",
    "\n",
    "**Some brainy differences to remember**:\n",
    "We are assuming that the connectivity is random. In real brains, the neurons that are averaged have correlated input and output connectivities. This will improve the correspondence between correlations and causality for the average effect because the system has a lower true dimensionality. However, in real brains the system is also order of magnitudes larger than what we examine here, and the experimenter never has the fully-observed system.\n",
    "\n",
    "## Simulate a large system\n",
    "\n",
    "Execute the next cell to simulate a large system of 256 neurons for 10000 timesteps - it will take a bit of time to finish so move on as it runs.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "colab": {},
    "colab_type": "code",
    "execution": {
     "iopub.execute_input": "2021-05-25T01:22:50.307602Z",
     "iopub.status.busy": "2021-05-25T01:22:50.307034Z",
     "iopub.status.idle": "2021-05-25T01:24:20.819599Z",
     "shell.execute_reply": "2021-05-25T01:24:20.820108Z"
    }
   },
   "outputs": [],
   "source": [
    "# @markdown Execute this cell to simulate a large system\n",
    "\n",
    "n_neurons = 256\n",
    "timesteps = 10000\n",
    "random_state = 42\n",
    "\n",
    "A = create_connectivity(n_neurons, random_state)\n",
    "X = simulate_neurons(A, timesteps)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "### Section 3.1: Coarsely sample the system\n",
    "\n",
    "We don't observe this system. Instead, we observe the average activity of groups.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "#### (Bonus) Exercise: Compute average activity across groups and compare resulting connectivity to the truth\n",
    " \n",
    "Let's get a new matrix `coarse_X` that has 16 groups, each reflecting the average activity of 16 neurons (since there are 256 neurons in total).\n",
    "\n",
    "We will then define the true coarse connectivity as the average of the neuronal connection strengths between groups. We'll compute the correlation between our coarsely sampled groups to estimate the connectivity and compare with the true connectivity."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {},
    "colab_type": "code",
    "execution": {
     "iopub.execute_input": "2021-05-25T01:24:20.826491Z",
     "iopub.status.busy": "2021-05-25T01:24:20.825917Z",
     "iopub.status.idle": "2021-05-25T01:24:20.829126Z",
     "shell.execute_reply": "2021-05-25T01:24:20.829586Z"
    }
   },
   "outputs": [],
   "source": [
    "def get_coarse_corr(n_groups, X):\n",
    "    \"\"\"\n",
    "    A wrapper function for our correlation calculations between coarsely sampled\n",
    "    A and R.\n",
    "\n",
    "    Args:\n",
    "        n_groups (int): the number of groups. should divide the number of neurons evenly\n",
    "        X: the simulated system\n",
    "\n",
    "    Returns:\n",
    "        A single float correlation value representing the similarity between A and R\n",
    "        ndarray: estimated connectivity matrix\n",
    "        ndarray: true connectivity matrix\n",
    "    \"\"\"\n",
    "\n",
    "    ############################################################################\n",
    "    ## TODO: Insert your code here to get coarsely sampled X\n",
    "    # Fill out function then remove\n",
    "    raise NotImplementedError('Student exercise: please complete get_coarse_corr')\n",
    "    ############################################################################\n",
    "    coarse_X = ...\n",
    "\n",
    "    # Make sure coarse_X is the right shape\n",
    "    assert coarse_X.shape == (n_groups, timesteps)\n",
    "\n",
    "    # Estimate connectivity from coarse system\n",
    "    R = correlation_for_all_neurons(coarse_X)\n",
    "\n",
    "    # Compute true coarse connectivity\n",
    "    coarse_A = A.reshape(n_groups, n_neurons // n_groups, n_groups, n_neurons // n_groups).mean(3).mean(1)\n",
    "\n",
    "    # Compute true vs estimated connectivity correlation\n",
    "    corr = np.corrcoef(coarse_A.flatten(), R.flatten())[0, 1]\n",
    "\n",
    "    return corr, R, coarse_A\n",
    "\n",
    "\n",
    "n_groups = 16\n",
    "\n",
    "# Uncomment below to test your function\n",
    "# corr, R, coarse_A = get_coarse_corr(n_groups, X)\n",
    "# plot_true_vs_estimated_connectivity(R, coarse_A)\n",
    "# print(\"Correlation: {}\".format(corr))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 337
    },
    "colab_type": "code",
    "execution": {
     "iopub.execute_input": "2021-05-25T01:24:20.835760Z",
     "iopub.status.busy": "2021-05-25T01:24:20.835152Z",
     "iopub.status.idle": "2021-05-25T01:24:21.318241Z",
     "shell.execute_reply": "2021-05-25T01:24:21.317731Z"
    },
    "outputId": "b9f26209-1d70-47fb-e999-6b97ca12c88b"
   },
   "outputs": [],
   "source": [
    "# to_remove solution\n",
    "def get_coarse_corr(n_groups, X):\n",
    "    \"\"\"\n",
    "    A wrapper function for our correlation calculations between coarsely sampled\n",
    "    A and R.\n",
    "\n",
    "    Args:\n",
    "        n_groups (int): the number of groups. should divide the number of neurons evenly\n",
    "        X: the simulated system\n",
    "\n",
    "    Returns:\n",
    "        A single float correlation value representing the similarity between A and R\n",
    "        ndarray: estimated connectivity matrix\n",
    "        ndarray: true connectivity matrix\n",
    "    \"\"\"\n",
    "\n",
    "    coarse_X = X.reshape(n_groups, n_neurons // n_groups, timesteps).mean(1)\n",
    "\n",
    "    # Make sure coarse_X is the right shape\n",
    "    assert coarse_X.shape == (n_groups, timesteps)\n",
    "\n",
    "    # Estimate connectivity from coarse system\n",
    "    R = correlation_for_all_neurons(coarse_X)\n",
    "\n",
    "    # Compute true coarse connectivity\n",
    "    coarse_A = A.reshape(n_groups, n_neurons // n_groups, n_groups, n_neurons // n_groups).mean(3).mean(1)\n",
    "\n",
    "    # Compute true vs estimated connectivity correlation\n",
    "    corr = np.corrcoef(coarse_A.flatten(), R.flatten())[0, 1]\n",
    "\n",
    "    return corr, R, coarse_A\n",
    "\n",
    "\n",
    "n_groups = 16\n",
    "\n",
    "# Uncomment below to test your function\n",
    "corr, R, coarse_A = get_coarse_corr(n_groups, X)\n",
    "plot_true_vs_estimated_connectivity(R, coarse_A)\n",
    "print(\"Correlation: {}\".format(corr))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "How close is the estimated coarse connectivity matrix to the truth?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "We will now look at the estimation quality for different levels of coarseness when averaged over 3 trials."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "cellView": "form",
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 483
    },
    "colab_type": "code",
    "execution": {
     "iopub.execute_input": "2021-05-25T01:24:21.327052Z",
     "iopub.status.busy": "2021-05-25T01:24:21.326468Z",
     "iopub.status.idle": "2021-05-25T01:24:39.879055Z",
     "shell.execute_reply": "2021-05-25T01:24:39.878560Z"
    },
    "outputId": "13bce303-8f6e-40c8-c386-51d4622ac507"
   },
   "outputs": [],
   "source": [
    "#@markdown Execute this cell to visualize plot\n",
    "\n",
    "n_neurons = 128\n",
    "timesteps = 5000\n",
    "n_trials = 3\n",
    "groups = [2 ** i for i in range(2, int(np.log2(n_neurons)))]\n",
    "\n",
    "corr_data = np.zeros((n_trials, len(groups)))\n",
    "\n",
    "for trial in range(n_trials):\n",
    "    print(\"Trial {} out of {}\".format(trial, n_trials))\n",
    "    A = create_connectivity(n_neurons, random_state=trial)\n",
    "    X = simulate_neurons(A, timesteps, random_state=trial)\n",
    "    for j, n_groups in enumerate(groups):\n",
    "        corr_data[trial, j], _, _ = get_coarse_corr(n_groups, X)\n",
    "\n",
    "corr_mean = corr_data.mean(axis=0)\n",
    "corr_std = corr_data.std(axis=0)\n",
    "\n",
    "plt.plot(np.divide(n_neurons, groups), corr_mean)\n",
    "plt.fill_between(np.divide(n_neurons, groups),\n",
    "                  corr_mean - corr_std,\n",
    "                  corr_mean + corr_std,\n",
    "                  alpha=.2)\n",
    "\n",
    "plt.ylim([-0.2, 1])\n",
    "plt.xlabel(\"Number of neurons per group ({} total neurons)\".format(n_neurons),\n",
    "        fontsize=15)\n",
    "plt.ylabel(\"Correlation of estimated effective connectivity\")\n",
    "plt.title(\"Connectivity estimation performance vs coarseness of sampling\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "colab_type": "text"
   },
   "source": [
    "We know that for large systems correlation ≠ causation. Here, we have looked at what happens we coarsely sample a large system. Do we get better at estimating the *effective* causal interaction between groups (=average of weights) from the correlation between the groups?\n",
    "\n",
    "From our simulation above, the answer appears to be no: as the number of neurons per group increases, we don't see any significant increase in our ability to estimate the causal interaction between groups."
   ]
  }
 ],
 "metadata": {
  "colab": {
   "collapsed_sections": [],
   "include_colab_link": true,
   "name": "W3D5_Tutorial2",
   "provenance": [],
   "toc_visible": true
  },
  "kernel": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "kernelspec": {
   "display_name": "Python 3",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
